System and Methods for Short RNA Expression

ABSTRACT

The invention provides inducible expression systems for making short RNA transcripts that can be used in cells and transgenic animals for a variety of applications, including but not limited to, producing and studying the effects of RNAi and microRNA mediated gene silencing.

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

This invention was made with government support under grant number P01AI56900 awarded by NIH/NIAID.

TECHNICAL FIELD

This invention relates to technologies for regulating gene expression, and more particularly to inducible systems for expressing short RNA molecules.

BACKGROUND

RNA interference (RNAi) is a powerful and widely used method to inhibit gene product expression in model organisms. RNAi is a highly coordinated post-transcriptional mechanism that was first described in nematodes. In RNAi, long double stranded RNAs and complex hairpin RNAs are processed into small interfering RNAs (siRNAs). These siRNAs are generally 21-23 bp RNA duplexes with characteristic dinucleotide overhangs. Duplex siRNAs are processed by helicases into single stranded siRNAs, which are able to participate in RNA induced silencing complexes (RISC). The RISC complex functions as a highly specific endonuclease that usually cleaves target RNAs with perfect complementarity to the siRNA in the RISC complex.

The power of RNAi as a tool lies in two features of the reaction just described. First, siRNAs trigger a self-amplifying feedback loop that requires only a small number of initial siRNAs to potentially degrade a large number of target RNAs. Cleavage of target RNAs by a RISC complex generates additional single stranded siRNAs, which in turn are able to participate in additional RISC complexes. Second, RNAi exhibits exquisite specificity. A single base pair mutation in either the siRNA, or in the target RNA, typically prevents RNAi silencing of the target RNA expression.

The power of siRNAs has fostered interest in the development of systems that can be used for RNAi-mediated silencing of pre-selected target genes in mammalian cells. Some systems employ chemically or enzymatically synthesized siRNAs to transiently induce RNAi in cells. Other systems use plasmid and viral vectors to express hairpin RNAs (siRNA-like transcripts) to stably induce the knockdown of expression of pre-selected genes. See, e.g., Brummelkamp, et al., Science 296:550-553 (2002); Novina, et al., Nat. Med. 8:681-686 (2002); and Rubinson, et al., Nat. Genet. 33:401-406 (2003). A third class of systems employs technologies that allow for conditional expression of siRNA-like transcripts. Czauderna, et al., Nucleic Acids Res. 31:e12 (2003) and Kasim, et al., Nucl. Acid. Res. Supp. No 3: 255-256 (2003).

SUMMARY OF THE INVENTION

The invention is based on novel expression systems that inducibly produce short RNA transcripts. The short RNA expression systems described herein have the ability to inducibly and very precisely, e.g., without extraneous sequence, produce short RNA transcripts, whose sequences can be pre-selected. These short RNA expression systems are very well suited for expressing RNA transcripts that are designed to induce gene silencing via any of the gene silencing mechanisms known to operate through very short, and often highly specific, RNA molecules. The invention also provides transgenic animals and cells carrying the short RNA expression systems disclosed herein. Because the systems of the present invention are inducible, they can be used to study the role of essential genes in cells and animals in ways that are not possible in constitutive expression systems. Additionally, the inducible expression system of the present invention can be used to study the effects of induced gene silencing in specific tissues.

In general, the invention features a nucleic acid molecule that includes the following sequence components: a promoter sequence capable of driving transcription of short RNA transcripts, a short RNA encoding sequence that encodes a short RNA transcript, and a STOP cassette.

Short RNA transcripts are transcripts with, e.g., fewer than 400 bases, or fewer than 201 bases, or fewer than 150 bases, or fewer than 100 bases, or fewer than 50 bases. Short RNA transcripts include RNA molecules capable of eliciting RNAi-mediated or micro-RNA-mediated gene silencing.

A STOP cassette includes the following sequence components: a termination sequence capable of preventing or terminating transcription by the RNA polymerase that binds the promoter sequence, a first loxP sequence, and a second loxP sequence. The loxP sequences flank the termination sequence. The termination sequence is positioned along the nucleic acid between the promoter sequence and the transcription initiation site of the short RNA encoding sequence in the nucleic acid molecule. In some, but not all, embodiments the short RNA encoding sequence overlaps with one of the loxP sequences.

In a first aspect, the invention features a nucleic acid molecule that includes: an RNA polymerase III promoter sequence; a short RNA encoding sequence that includes a transcription initiation site; and a STOP cassette. The STOP cassette includes an RNA polymerase III-specific termination sequence, a first loxP sequence and a second loxP sequence. The loxP sequences flank the termination sequence, and the termination sequence is disposed between the promoter sequence and the transcription initiation site of the short RNA encoding sequence in the nucleic acid molecule. In some, but not all, embodiments the short RNA encoding sequence overlaps with one of the loxP sequences.

In some embodiments of the first aspect, the first loxP sequence is a wild-type loxP sequence. In some embodiments of the first aspect, the second loxP sequence is the loxP that is downstream from the termination sequence, and the second loxP is a mutant loxP sequence. For example, the second loxP sequence can contain sequence that overlaps with some or all of the short RNA encoding sequence. In other words, the n-terminal nucleotides in the terminus of the loxP that is proximal to the short RNA consist of the 5′ terminal sequence of the short RNA encoding sequence, wherein n=1 to 10. In other examples of this embodiment, the five terminal nucleotides in the loxP sequence overlap with, i.e. consist of, the five 5′ terminal nucleotides of the short RNA encoding sequence. The five 5′ terminal nucleotides of the short RNA encoding sequence are the nucleotides at the (+1) through (+5) positions of the transcript encoding sequence.

In some embodiments of the first aspect, the nucleic acid includes a thymidine nucleotide in the sequence position that immediately precedes the upstream terminal sequence of the loxP sequence that is located upstream of the termination sequence. An example of this embodiment also includes the wild-type first loxP sequence described above. Some examples of this embodiment also include the mutant second loxP sequences described above, i.e. in which the n-terminal nucleotides in the terminus of the loxP that is proximal to the short RNA consist of the 5′ terminal sequence of the short RNA encoding sequence, wherein n=1 to 10.

In some embodiments of the first aspect, the promoter sequence includes some portion of the RNA polymerase III promoter sequence from the genomic sequence of the small nuclear RNA U6 promoter. Examples of this embodiment include nucleic acids with a STOP cassette that includes from 1 to 190 bases of the genomic sequence that is immediately downstream of the small nuclear RNA U6 genomic transcription termination signal. In another example of this embodiment, the STOP cassette of the nucleic acid includes a modified genomic U6 transcription termination sequence that includes: some number, from 1 to 20, inclusive, of additional thymidine nucleotides disposed immediately adjacent to the wild-type U6 thymidine termination signal (or T-stretch); and also includes some number, from 1 to 190, inclusive, of nucleotides encoding the wild-type U6 genomic sequence that is immediately downstream of the thymidine termination sequence. In some examples of this embodiment, the termination sequence includes more than one T-stretch and also includes some number, from 1 to 190, inclusive, of nucleotides encoding the wild-type U6 genomic sequence that is immediately downstream of the thymidine termination sequence. Some examples of this embodiment also include a wild-type loxP sequence. Some examples of this embodiment also include the mutant loxP sequences described above, i.e. in which the n-terminal nucleotides in the terminus of the loxP that is proximal to the short RNA consist of the 5′ terminal sequence of the short RNA encoding sequence, wherein n=1 to 10.

In other embodiments of the first aspect, the short RNA encoding sequence encodes a transcript with fewer than 400, e.g., fewer than 200, fewer than 100, fewer than 70, fewer than 60, fewer than 50, fewer than 40, or fewer than 30 nucleotides. Examples of this embodiment also include one or more of the following: any of the promoter sequences, any of the termination sequences, the wild-type loxP sequence, or any of the mutant loxP sequences that are described herein.

In a second aspect, the invention features a transgenic animal that has incorporated into its genome any of the nucleic acids described herein, for example the nucleic acids described in the first aspect of the invention.

In one embodiment, the transgenic animal also includes a nucleic acid molecule encoding a Cre recombinase. In one example of this embodiment, expression of the Cre recombinase is developmentally regulated, e.g., the Cre recombinase is maximally expressed only at one or more specific stages of embryonic or animal development. In another example of this embodiment, expression of the Cre recombinase is tissue-specific, e.g., the Cre recombinase is maximally expressed only in one or more specific cell types.

In some embodiments, the transgenic animal described herein is one of the following: a mouse, a rat, a goat, a pig, a monkey, a cow, a rabbit, a sheep, a hamster, a chicken, or a frog. In one example of this embodiment, expression of the Cre recombinase is developmentally regulated, e.g., the Cre recombinase is maximally expressed only at one or more specific stages of embryonic or animal development. In another example of this embodiment, expression of the Cre recombinase is tissue-specific, e.g., the Cre recombinase is maximally expressed only in one or more specific cell types.

In a third aspect, the invention features a eukaryotic cell that includes any of the nucleic acids described herein, for example, the nucleic acids described in the first aspect of the invention. In one embodiment, the cell is an animal cell, e.g., the cell is a mammalian cell. In another embodiment the cell is an embryonic stem cell.

In some embodiments, any of the cells described herein also includes a nucleic acid molecule encoding a Cre recombinase gene. In other embodiments, any of the cells described herein also includes a Cre recombinase protein.

In a fourth aspect, the invention features a method of making an inducible short RNA expression system. The method includes linking two or more nucleic acids to produce any one of the nucleic acids described herein, e.g., the nucleic acids described in the first aspect of the invention.

In a fifth aspect, the invention features a method of making a transgenic animal. In one embodiment, the method includes introducing into the genome of an embryonic stem (ES) cell any of the nucleic acid molecules described herein, e.g., the nucleic acids described in the first aspect of the invention, to generate a transgenic ES cell. The method also includes introducing the transgenic ES cell into an embryo, implanting the embryo into an animal capable of carrying the embryo to term, and allowing the embryo to come to term, thereby generating a transgenic animal. In one example of this embodiment, the method generates a chimeric transgenic animal, and the method further includes crossing the chimeric transgenic animal to another animal of the same species to generate a founder transgenic animal.

In another embodiment, the method includes introducing into the genome of an oocyte any of the nucleic acid molecules described herein, e.g., the nucleic acids described in the first aspect of the invention. The method also includes fertilizing the oocyte to produce an embryo, implanting the embryo in an animal capable of carrying the embryo to term, and allowing the embryo to come to term, thereby generating a transgenic animal.

In a sixth aspect, the invention features a method of making an animal cell containing an inducible short RNA expression system. The method includes transfecting a cell with any of the nucleic acid molecules described herein, e.g., the nucleic acids described in the first aspect of the invention. In an example of the method, the transfected cell is a cell from any one of the following animals: a human, a mouse, a rat, a goat, a pig, a monkey, a cow, a rabbit, a sheep, a chicken, a frog, or a fish.

In a seventh aspect, the invention features a method of studying gene function in a cell. The method includes: providing any of the cells described herein, e.g., the cells of the third aspect; inducing transcription of the short RNA encoding sequence; and monitoring changes in the cell.

In an eighth aspect, the invention features a method of studying gene function in an organism. The method includes: providing any of the transgenic animals described herein, e.g., the transgenic animals described in the second aspect of the invention; inducing transcription of the short RNA encoding sequence; and monitoring changes in the organism.

TERMS

“Short RNAs” and “short RNA transcripts” are ribonucleic acids, typically less than 400 bases in length. Some short RNAs are capable of eliciting RNAi-mediated or micro-RNA-mediated gene silencing.

“Short RNA encoding sequence” is a nucleic acid sequence coding for a short RNA transcript. Typically a short RNA encoding sequence will be a DNA sequence coding for a short RNA transcript. A short RNA encoding sequence can also be an RNA sequence, e.g., in an RNA virus vector, that encodes, e.g., by reverse transcription, a short RNA transcript.

“Transcription unit” is a nucleic acid that includes a promoter sequence, a transcript sequence, and a transcript termination sequence.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1(a) and (b) are schematic diagrams of the U6lox-shA1 construct (a) before a Cre-mediated excision of the termination sequence, and (b) after a Cre-mediated deletion of the termination sequence.

FIG. 2(a) is a diagram of the targeting strategy for inserting the U6lox-shA1 construct into the HPRT locus of HM1 stem cells. FIG. 2(b) is a Southern Blot confirming insertion of the U6lox-shA1 construct.

FIG. 3 is a schematic diagram of the A1-IRES-EGFP reporter construct.

FIG. 4(a)-(d) are the results of experiments verifying the Cre-mediated induction of shA1 expression and subsequent specific downregulation of the A1-IRES-EGFP reporter construct.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

The following is a description of specific embodiments of the invention. The inducible short RNA expression systems and methods are described in conjunction with specific nucleic acid sequences. Nevertheless, it should be recognized that the inducible expression system and methods described in the present specification and the claims may also be used in conjunction with other nucleic acid sequences. Although the inducible short RNA expression systems are described as useful in methods that regulate gene expression through RNAi and micro-RNA induced mechanisms, it should be recognized that the systems are also useful in other methods, e.g. in applications that require the expression of short RNAs for purposes other than RNA-mediated gene product regulation.

Brief View of the Novel Expression Systems

The components of the expression systems include an RNA Polymerase III (Pol III)-specific promoter sequence, a loxP-flanked STOP cassette sequence, and a short RNA encoding sequence. These three nucleotide sequences are arranged on a nucleic acid such that the promoter is upstream of the STOP cassette, and the STOP cassette is upstream of the short RNA encoding sequence. The terms upstream and downstream as used herein refer to the direction of productive transcription on a nucleic acid molecule starting from the Pol III promoter's transcription start site. Productive transcription starts from an upstream position on a nucleic acid molecule and proceeds downstream along the molecule, until transcription is terminated. Thus, in the present systems, the short RNA encoding sequence is downstream of the STOP cassette, and the STOP cassette is downstream of the Pol III-specific promoter. The relative locations of these three components in the present system prevent transcription of the short RNA encoding sequence by RNA polymerase III because the STOP cassette's termination sequence is located between the Pol III promoter and the short RNA encoding sequence.

When Pol III polymerase assembles on the Pol III promoter sequence of the systems, it proceeds downstream from the promoter sequence towards the short RNA encoding sequence. Before it reaches the short RNA encoding sequence, though, the polymerase encounters the termination sequence in the STOP cassette. The termination sequence causes the polymerase to abort the transcription reaction before any short RNA encoding sequence is transcribed.

Transcription of short RNA transcripts in the systems can be induced by causing the nucleic acid to be contacted by a Cre recombinase. Cre recombinase can catalyze the excision of the STOP cassette from the nucleic acid, thereby producing a nucleic acid that no longer contains a transcription termination signal between the promoter sequence and the short RNA encoding sequence. Cre-mediated excision of the STOP cassette in the present systems modifies the nucleic acids of the systems disclosed herein to allow Pol III promoter driven transcription of the short RNA encoding sequence.
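The induction step described above can be illustrated with a minimal string-level sketch. The sketch is not the construct of the Example: the promoter, terminator, and short RNA encoding sequences are made-up placeholders, the loxP sequence is the commonly published wild-type site, and the function name cre_excise is hypothetical. It models only the net outcome stated in the text, namely that recombination between two directly repeated loxP sites removes the intervening termination sequence and leaves a single loxP between the promoter and the short RNA encoding sequence.

```python
# Illustrative model only. Every sequence except loxP is a made-up placeholder.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"  # commonly published wild-type loxP

PROMOTER = "GGCCGGCCAGATCTCC"         # hypothetical stand-in for a Pol III promoter
TERMINATOR = "TTTTTTGGAACCGGAACC"     # hypothetical T-stretch plus downstream sequence
SHORT_RNA = "GCAGCATCGAGATCCAGTTTTT"  # hypothetical short RNA encoding sequence and its own T-stretch


def cre_excise(dna: str, site: str = LOXP) -> str:
    """Remove the segment between the first two directly repeated sites.

    One copy of the site is left behind. If fewer than two sites are
    present, the sequence is returned unchanged.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    # Keep everything up to the first site, then resume at the second site.
    return dna[:first] + dna[second:]


# Before induction: promoter, loxP, terminator, loxP, short RNA encoding sequence.
uninduced = PROMOTER + LOXP + TERMINATOR + LOXP + SHORT_RNA
# After induction: the terminator is gone and one loxP remains.
induced = cre_excise(uninduced)
```

In this toy model a second call to cre_excise leaves the induced molecule unchanged, because only one loxP site remains.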

Detailed View of the Nucleic Acids of the Novel Short RNA Expression Systems

1. Promoter Sequences

Promoters that can be used in the short RNA expression system of the present invention are nucleic acids that include a promoter sequence capable of driving expression of short RNAs, e.g., RNAs which can induce RNAi or micro-RNA mediated gene silencing. Preferred promoters are those whose transcription start and stop sites are very predictable and precise. Examples of such promoters are the RNA polymerase III (Pol III)-specific promoters, which include the Pol III type 3 core promoters, which are described in detail in Schramm and Hernandez, Genes & Dev. 16:2593-2620 (2002). Pol III promoter sequences are DNA sequences that recruit Pol III, i.e. on which Pol III can assemble inside of a cell, for the first step of a Pol III transcription reaction.

Promoters that can be used in the present invention can include the U6 snRNA gene (U6) promoter sequence. The U6 gene is transcribed by Pol III and encodes the U6 snRNA component of the spliceosome. The U6 promoter sequence can be the U6 promoter sequence from a mammal, including a human or a mouse, or it can be the U6 promoter sequence from a non-mammalian animal. Other Pol III promoters that can be used in the present invention include promoter sequences that drive transcription of the H1 RNase P gene (H1). The H1 promoter sequence can be the sequence of the H1 promoter from a human, a mouse, a mammal, or an animal.

The U6 and H1 Pol III transcription units share several unusual features. First, none of the promoter elements, except the (+1) transcription start site, is located in the transcribed region of either the U6 or H1 gene. This feature means that almost any pre-selected sequence can be placed downstream of the U6 or H1 promoter start site, and Pol III will drive expression of that sequence. Second, Pol III promoters, e.g., the U6 and H1 promoters, start transcription from precisely defined distances, i.e., between 25 and 32 bp, downstream of the TATA box. This feature provides the necessary control for the expression of short pre-selected transcripts. Third, Pol III recognizes a run of 4-5 thymidine residues as a termination signal. This feature not only allows for easy control of transcript termination, but also results in overhanging uridines, which resemble the overhanging uridines or thymidines at the end of synthetic siRNAs. Finally, it is worth noting that Pol III normally transcribes only very short genes, generally less than 400 bp.

2. The STOP Cassette

The STOP cassettes of the present invention are nucleic acids. The nucleotide sequence of these nucleic acids includes: a transcription termination sequence and two loxP sequences. The two loxP sequences flank the termination sequence, i.e., one loxP is positioned at the 5′ terminus of the termination sequence (i.e. upstream of the termination sequence) and the other loxP is positioned at the 3′ terminus of the termination sequence (i.e. downstream of the termination sequence).

The choice of termination sequence used in a STOP cassette will depend on the polymerase activity the STOP cassette is designed to terminate. Thus, if the promoter sequence used in a system of the present invention is a Pol III promoter sequence, then the termination sequence used in the system is a sequence capable of preventing or terminating Pol III transcription. If the promoter sequence is one that recruits another kind of polymerase, then the transcription termination sequence of the STOP cassette is a sequence capable of preventing or terminating transcription of that other kind of polymerase that is recruited by the promoter.

The Pol III polymerase is unique in its ability to recognize a simple run of four to five consecutive thymidines as a termination signal (T-stretch). Schramm and Hernandez, Genes & Dev. 16:2593-2620 (2002). Transcription termination can be enhanced by including multiple T-stretches at the end of a Pol III transcribed gene. Transcription termination can also be enhanced by increasing the number of consecutive thymidines in a T-stretch. Furthermore, reports have also suggested that untranscribed sequence downstream of the termination signal can affect the termination efficiency of the Pol III termination signal. Das, et al., EMBO J. 7:503-512 (1988).

When a Pol III promoter is used in a system of the present invention, appropriate termination sequences for use in the system can be sequences that include a run of four to five consecutive thymidines. The termination sequence can optionally include more than 5 consecutive thymidines. The termination sequence can optionally include untranscribed downstream sequences from known genomic Pol III termination signals.
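The T-stretch rule described above lends itself to a simple sequence check. The following is a minimal sketch, assuming the sequence is written as the non-template (coding) DNA strand and that a run of at least four consecutive thymidines acts as a Pol III termination signal. The function name find_t_stretch and the min_len parameter are hypothetical, and the sketch does not model termination efficiency or the number of uridines retained at the 3′ end of the transcript.

```python
import re
from typing import Optional, Tuple


def find_t_stretch(coding_strand: str, min_len: int = 4) -> Optional[Tuple[int, int]]:
    """Locate the first run of at least min_len consecutive thymidines.

    Returns (start_index, run_length) using 0-based indexing, or None when
    the sequence contains no such run.
    """
    match = re.search("T{%d,}" % min_len, coding_strand.upper())
    if match is None:
        return None
    return match.start(), match.end() - match.start()


# A candidate short RNA encoding sequence can be screened for internal
# T-stretches, which would be expected to terminate transcription early.
hit = find_t_stretch("GATTACAGGTTTTTCC")
```

A check of this kind may be useful when pre-selecting a short RNA encoding sequence, since a run of four or more thymidines inside the encoding sequence could act as a premature terminator.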

For example, when a Pol III promoter is used in a system of the present invention, the termination sequence can include sequences that are downstream of the genomic U6 termination signal. The termination sequence can include any number, from 50 to 190, of bases of the wild-type genomic U6 sequence that is downstream of the U6 gene's T-stretch.

Other examples of termination sequences that can be used in conjunction with a Pol III promoter sequence in systems of the present invention can include sequences that are downstream of the H1 termination signal. The termination sequence can include any number, from 20 to 190, of bases of the wild-type H1 sequence that is downstream of the H1 gene's T-stretch.

The loxP sequences in the STOP cassette can include wild-type loxP sequences or one or two mutant loxP sequences. Wild-type loxP sequences are 34 base pair (bp) sequences that are recognized by the Cre recombinase in reactions described more fully below. A wild-type loxP sequence consists of two 13 bp inverted repeats separated by an 8 bp spacer region. The loxP sequence has been published and is also provided in the Example below. See, e.g., Sauer, B., Nucl. Acids Res. 24:4608-4613 (1996). It is worth noting that to be functional, a wild-type loxP sequence must be on a double stranded DNA molecule. The systems of the present invention are not limited to double stranded DNA molecules. For example, the present invention contemplates the use of retroviruses that carry sequences coding for a promoter, a loxP-flanked terminator sequence, and a short RNA encoding sequence. Such retroviruses might be used to insert DNA molecules in the genome of a host, thereby generating a functional inducible expression system. The terms “wild-type loxP sequence” or “mutant loxP sequence” therefore should also be understood to include single stranded DNA sequences and RNA sequences coding for functional DNA loxP sequences.
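The 13-8-13 architecture of a wild-type loxP site can be checked directly on its sequence. The sketch below uses the commonly published wild-type loxP sequence as an assumption, and the helper names split_loxp and has_inverted_repeats are hypothetical. It confirms that the two 13 bp arms are reverse complements of one another, which is what "inverted repeats" means at the sequence level.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Commonly published wild-type loxP (assumed here; 13 bp arm, 8 bp spacer, 13 bp arm).
WT_LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_loxp(site: str):
    """Split a 34 bp site into (left arm, spacer, right arm)."""
    if len(site) != 34:
        raise ValueError("a loxP-like site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def has_inverted_repeats(site: str) -> bool:
    """True when the left arm equals the reverse complement of the right arm."""
    left, _spacer, right = split_loxp(site)
    return left == reverse_complement(right)


left_arm, spacer, right_arm = split_loxp(WT_LOXP)
```

The spacer is not palindromic, which gives the site an orientation; that orientation is why two sites arranged as direct repeats lead to excision of the intervening sequence.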

In some embodiments the expression system of the present invention will include one mutant loxP sequence. The mutant loxP sequence can be the loxP sequence that is upstream or the loxP sequence that is downstream of the termination sequence in the STOP cassette. Some mutant loxP sequences will contain one or more mutated bases in the terminal 10 bases of one terminus of a loxP sequence. The terminus of a loxP sequence refers to one of the two 5′ and 3′ ends of the loxP sequence. Thus every loxP in a STOP cassette contains two termini, an upstream and a downstream terminus relative to the direction of productive transcription generated by the promoter sequence in the system. The terminal 10 bases of a loxP terminus are the ten consecutive bases that constitute one of the two termini of a loxP sequence.

In some embodiments the mutated loxP sequence will include one or more mutant bases in the downstream terminus of the loxP sequence that is downstream of the termination sequence. Examples of such loxP mutants are loxP sequences that contain one or more mutations in the 10 bases of the downstream terminus. In some examples the mutant downstream loxP terminal sequence will overlap with the first 1-10, e.g., 5, bases of the short RNA encoding sequence. In other words, the downstream terminal sequence of the loxP sequence located downstream of the termination sequence can include, or overlap with, the upstream terminal sequence of the short RNA encoding sequence. The usefulness of such mutant loxPs is explained below.

3. Short RNA Encoding Sequences

The short RNA encoding sequences of the present invention are nucleic acid sequences coding for short RNA transcripts. Short RNA transcripts are transcripts consisting of 120 nucleotides or less. Short RNA encoding sequences include those that code for siRNA-like hairpins, which can be between 10 and 40 nucleotides in length. In some systems short RNA encoding sequences encode transcripts that are between 15 and 30 nucleotides in length. In some systems short RNA encoding sequences encode transcripts that are between 18 and 24 nucleotides in length. Many short RNA encoding sequences include sequences coding for transcripts that can activate a cell's RNAi gene silencing mechanisms.

Short RNA transcripts also include micro-RNA-like precursors and micro-RNA-like transcripts. Micro-RNA precursors can be approximately 70 nucleotides in length. Lee et al., EMBO J. 21:4663-4670 (2002). Processed micro-RNAs can be much smaller, e.g., from 10-40 nucleotides long, or 15-30 nucleotides long, or most frequently between 18-24 nucleotides long. Micro-RNAs mediate gene silencing through a different mechanism than RNAi. Unlike siRNAs, micro-RNAs are not usually perfectly complementary to their targets. Short RNA encoding sequences in the present system include sequences coding for transcripts that activate a cell's micro-RNA mediated gene-silencing mechanisms.

In keeping with standard molecular biological usage, the first nucleotide of the short RNA transcript is encoded by the transcription initiation (+1) site of the short RNA encoding sequence. The transcription initiation site is therefore upstream of every other nucleotide in the short RNA encoding sequence. The second nucleotide in the short RNA encoding sequence that is transcribed can be referred to as the (+2) position, and the third nucleotide in a developing transcript is coded for by the (+3) position in the short RNA encoding sequence, etc.

In some embodiments the upstream portion of the short RNA encoding sequence overlaps with the closest, i.e., proximal, loxP sequence in the nucleic acid. (The loxP proximal to the short RNA encoding sequence is the downstream loxP relative to the other loxP in the system.) In these embodiments the downstream terminal sequence of the loxP sequence that is proximal to the short RNA encoding sequence is the upstream sequence of the short RNA encoding sequence. Stated differently, the downstream terminal sequence of the downstream loxP contains the transcription initiation site of the short RNA encoding sequence, and optionally includes one or more bases of additional short RNA encoding sequence.

In some embodiments of the system, the 10 terminal bases of the downstream terminus of the downstream loxP sequence are also the +1 through +10 positions of the short RNA encoding sequence. In other embodiments the 5 terminal bases of the downstream terminus of the downstream loxP sequence are also the +1 through +5 positions of the short RNA encoding sequence.
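The overlap between the downstream loxP and the short RNA encoding sequence can be made concrete with a short sketch. It assumes the commonly published wild-type loxP sequence and a made-up short RNA encoding sequence; the function names overlap_loxp_with_short_rna and changed_positions are hypothetical. The sketch only builds the sequence junction. It says nothing about whether a particular mutant loxP remains a substrate for Cre recombinase.

```python
# Commonly published wild-type loxP (assumed here).
WT_LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def overlap_loxp_with_short_rna(loxp: str, short_rna_coding: str, n: int = 5):
    """Build a mutant loxP whose last n bases are positions +1..+n of the short RNA.

    Returns (mutant_loxp, junction), where junction is the mutant loxP
    followed by the remainder (+n+1 onward) of the short RNA encoding sequence.
    """
    if not 1 <= n <= 10:
        raise ValueError("n is expected to be between 1 and 10")
    if n > len(short_rna_coding):
        raise ValueError("short RNA encoding sequence is shorter than n")
    mutant_loxp = loxp[:-n] + short_rna_coding[:n]
    junction = mutant_loxp + short_rna_coding[n:]
    return mutant_loxp, junction


def changed_positions(wild_type: str, mutant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(wild_type, mutant) if a != b)


# Hypothetical short RNA encoding sequence, used only for illustration.
short_rna = "GGAAGCTTACGATCCA"
mutant, junction = overlap_loxp_with_short_rna(WT_LOXP, short_rna, n=5)
```

With this arrangement the +1 position of the short RNA encoding sequence sits n bases before the end of the loxP, so after Cre-mediated excision the distance between the promoter and the transcription initiation site is set in part by the remaining loxP rather than by loxP plus additional spacer sequence.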

Termination of transcription of the short RNA encoding sequences is achieved by placing a termination signal immediately downstream of the short RNA encoding sequence. In the present system, the most downstream portion of the short RNA encoding sequence will contain the first one, two, or three thymidines of the stretch of consecutive thymidines that represents a Pol III termination signal.

Functional Equivalents

Skilled artisans will recognize that functional equivalents can be used in place of certain sequences described herein, in conjunction with the inducible expression systems disclosed herein. For example, in one embodiment, a functional equivalent can be used instead of the mouse genomic U6 promoter sequence provided in Table 2 of Example 1. Functional equivalents of the mouse U6 promoter sequence include sequences that differ by one or more bases from the sequence provided in Table 2 and that retain an ability to recruit RNA polymerase III in the first step of a reaction that leads to productive RNA transcription. Similarly, functional equivalents of any other Pol III promoter sequence, e.g., the human genomic U6 promoter sequence or the human or mouse genomic H1 promoter sequences, include sequences that differ by one or more bases from that Pol III promoter sequence and also retain an ability to recruit Pol III in the first step of a reaction that leads to productive transcription.

Functional equivalents can also be used instead of genomic sequences downstream of a Pol III termination signal.

Functional equivalent sequences include those sequences that also have a high percentage of identity to the sequences already known to skilled artisans and/or those sequences disclosed herein that can be used in conjunction with the expression systems of the present invention. Functional equivalents include sequences with 99%, 98%, 97%, or any percentage higher than 90%, or any percentage higher than 80%, or any percentage higher than 70%, identity to a known or disclosed sequence.

To determine the percent identity of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least about 30%, preferably at least about 40%, more preferably at least about 50%, even more preferably at least about 60%, and even more preferably at least about 70%, 80%, 90%, or 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein nucleic acid “identity” is equivalent to nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
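The position-by-position comparison described above can be sketched for two sequences that have already been aligned. The sketch assumes that gaps are written as "-" and that percent identity is taken as identical positions divided by the total number of alignment columns, so that every gap position lowers the score. This is one simple convention chosen for illustration. It is not the computation performed by the GAP, ALIGN, or BLAST programs, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all columns of a pre-computed pairwise alignment.

    Columns in which either sequence has a gap count toward the alignment
    length but never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Five identical columns out of six; the single gap column counts against identity.
score = percent_identity("ACGT-A", "ACGTTA")
```

Under this convention a candidate sequence would be compared against a threshold such as 70%, 80%, or 90% identity after alignment; other conventions, such as excluding gap columns from the denominator, would give different values for the same alignment.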

The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity or homology limitation of the invention) is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.

The percent identity between nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

Methods of Making the Nucleic Acid of the Present Invention

Techniques and methods for engineering recombinant nucleic acids are well known in the art. Examples of such techniques and methods include restriction enzyme digestion, site-directed mutagenesis, and in vitro transcription.

Methods of Using the Nucleic Acids of the Present Invention

The nucleic acids of the present invention can be placed inside living cells and organisms. For example, the nucleic acids of the present invention can be placed in nucleic acid vectors which are subsequently introduced into hosts by a variety of methods which are known in the art, e.g., transformation, transfection, electroporation, and liposome delivery. Examples of vectors include plasmids, phages, cosmids, phagemids, yeast artificial chromosomes (YAC), bacterial artificial chromosomes (BAC), human artificial chromosomes (HAC), viral vectors, such as adenoviral vectors and retroviral vectors, and other DNA sequences which are able to replicate or to be replicated in vitro or in a host cell, or to convey a desired DNA segment to a desired location within a host cell.

Examples of organisms that can be hosts for vectors carrying the nucleic acid of the present invention include bacteria, yeast, flies, nematodes, animals and mammals. Examples of cells that can be hosts to vectors carrying the nucleic acids of the present invention include cells available from the American Type Culture Collection (ATCC) (Manassas, Va.).

Transgenic Animals

In some embodiments of the invention the nucleic acids of the disclosed expression system are integrated into the genome of transgenic animals. Transgenic animals can be generated by introducing the nucleic acids disclosed herein into the germline of an animal. Methods for introducing nucleic acids into the germline of animals and generating transgenic animals, e.g., chimeric transgenics or founder lines of transgenics, are known in the art. See, e.g., Torres, R. M. and Kuhn, R., Laboratory Protocols for Conditional Gene Targeting, Oxford University Press, Oxford, U.K. (1997) and Nagy, et al., Manipulating the Mouse Embryo: A Laboratory Manual (Third Edition), Cold Spring Harbor Laboratory Press, Woodbury, N.Y. (2003). The Example provided below describes the introduction of a nucleic acid containing an inducible short RNA expression system into mouse embryonic stem cells.

Additional techniques that can be used to produce the founder lines of transgenic animals include, but are not limited to, pronuclear microinjection (U.S. Pat. No. 4,873,191); retrovirus-mediated gene transfer into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci. USA 82:6148, 1985); gene targeting into embryonic stem cells (Thompson et al., Cell 56:313, 1989); and electroporation of embryos (Lo, Mol. Cell. Biol. 3:1803, 1983). For a review of techniques that can be used to generate and assess transgenic animals, skilled artisans can consult Gordon (Intl. Rev. Cytol. 115:171-229, 1989), and may obtain additional guidance from, for example: Hogan et al., “Manipulating the Mouse Embryo” (Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1986); Krimpenfort et al., Bio/Technology 9:86, 1991; Palmiter et al., Cell 41:343, 1985; Kraemer et al., “Genetic Manipulation of the Early Mammalian Embryo” (Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1985); Hammer et al., Nature 315:680, 1985; Purcel et al., Science 244:1281, 1986; Wagner et al., U.S. Pat. No. 5,175,385; and Krimpenfort et al., U.S. Pat. No. 5,175,384.

Methods of Inducing the Inducible Expression System

When the nucleic acids described herein are introduced into eukaryotic host cells, the host cell's RNA polymerase III (Pol III) is recruited to the Pol III promoter sequence of the nucleic acid. The promoter cannot, however, initiate transcription of the short RNA encoding sequence, because of the STOP cassette that is located between the Pol III promoter and the short RNA encoding sequence. When Pol III begins moving downstream of the promoter, the polymerase encounters the termination sequence in the STOP cassette and aborts transcription before short RNA transcript synthesis begins.

Induction of short RNA expression in the system described herein is achieved by exposing the expression system to a Cre recombinase. The ability of Cre recombinase to excise loxP-flanked sequences of DNA has been extensively described. See, e.g., Guo, et al., Nature 389, 40-46 (1997) and Lakso, et al., Proc. Nat'l Acad. Sci. USA 89, 6232-6236 (1992). Briefly, Cre recombinase recognizes loxP sites flanking a DNA sequence and either excises or inverts the DNA sequence between the two loxP sites. Although loxP sequences contain two inverted 13 bp repeats, the 8 spacer nucleotides are not palindromic and provide loxP sites with an orientation. Excision occurs between two loxP sites oriented in the same direction, while inversion occurs between loxP sites that are oriented in opposite directions. A Cre-mediated excision reaction removes all the DNA between the two original loxP sites and leaves behind one loxP sequence.

In the present system, a Pol III termination sequence is flanked by two loxP sequences. Thus, in the present system, a Cre-mediated excision results in the removal of the DNA that encodes the Pol III termination signal. After removal of the termination signal, Pol III is free to bind the Pol III promoter sequence of the expression system and transcribe the short RNA encoding sequence that is downstream of the promoter. Having removed the termination signal of the STOP cassette, there remains only one loxP sequence between the promoter sequence and the short RNA encoding sequence, thereby allowing for transcription of the short RNA encoding sequence.
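The excision and inversion outcomes described above can be sketched in a few lines of string manipulation. This is only an illustrative model under stated assumptions: it matches exact copies of the wild-type 34 bp loxP sequence, so a mutant site such as the one used in the Example would need a relaxed match, and the function names are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # wild-type loxP, 34 bp

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq: str) -> list:
    """Return sorted (start, strand) tuples for exact loxP matches."""
    sites = []
    for strand, pattern in (("+", LOXP), ("-", revcomp(LOXP))):
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(sites)

def cre_recombine(seq: str) -> str:
    """Model the product of Cre acting on a sequence with two loxP sites."""
    seq = seq.upper()
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites")
    (a, strand_a), (b, strand_b) = sites
    if strand_a == strand_b:
        # Same orientation: excise the intervening DNA, leaving one loxP site.
        return seq[:a + len(LOXP)] + seq[b + len(LOXP):]
    # Opposite orientation: invert the DNA lying between the two sites.
    inner = seq[a + len(LOXP):b]
    return seq[:a + len(LOXP)] + revcomp(inner) + seq[b:]
```

For a toy construct of the form promoter, loxP, STOP, loxP, short RNA sequence, the function returns promoter, loxP, short RNA sequence, which mirrors the single remaining loxP site described above.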

Optimizing Short RNA Expression

In applications such as siRNA-like and microRNA-like gene silencing, the exact transcript sequence generated by the short RNA encoding sequence can be very important. For example, single base pair mutations can abolish the ability of a transcript to induce RNAi. It is also undesirable to include extraneous sequence in a short RNA transcript, as the extraneous sequence can also abolish gene silencing. Therefore a short RNA expression system should include features that eliminate unwanted mutations or extraneous sequence in the short RNA transcript.

The fact that Cre-mediated recombination leaves behind one 34 base pair loxP sequence between the promoter sequence and the short RNA encoding sequence can create a problem. Since Pol III promoters start transcription between 25 and 32 base pairs downstream of the TATA box, it will frequently not be desirable to locate the TATA box of the promoter sequence upstream of the loxP site that is proximal to the promoter sequence. If the TATA box is placed upstream of the promoter-proximal loxP site, then the Pol III transcription start site, i.e., the (+1) site, will be located inside the loxP sequence that remains after a Cre-mediated excision.

This problem can be minimized by taking advantage of the fact that the 5′ end of a loxP site has the following sequence: 5′-Adenine, Thymidine, Adenine-3′ (ATA). By introducing a thymidine residue immediately upstream of the loxP site that is proximal to the promoter sequence, a functional TATA box is produced that will remain after a Cre-mediated recombination event in the expression system.

Nonetheless, transcription can still start within the loxP even though the TATA box includes the first three nucleotides of the loxP site. For example, the transcription start site of the U6 promoter is 26 base pairs downstream of the TATA box. In an inducible expression system modified so that the TATA box includes the first three nucleotides of the remaining loxP sequence, a U6 promoter sequence will cause transcription to begin within the loxP sequence, i.e., such a transcript will include sequence encoded by the downstream terminal 5 bases of the loxP sequence.

To drive the expression of short RNA transcripts that do not begin with the terminal 5 bases of the loxP sequence, the present invention recognizes that the loxP sequence that is proximal to the short RNA encoding sequence can be mutated, so that after a recombination event, the system expresses short RNA transcripts that do not include wild-type loxP sequence. Thus, as shown in the Example below, the terminal 5 base pairs of the loxP sequence that is distal from the promoter can be mutated to encode the first 5 bases of the desired short RNA transcript. The mutation effectively creates an overlap of the mutant loxP sequence and the short RNA encoding sequence. The mutation described in the Example did not affect recombination efficiency and produced a transcript capable of inducing gene silencing.

This strategy can be generalized and adapted to different promoters and different pre-selected short RNA transcripts. Once the distance from the TATA box to a transcription start site has been determined for a given Pol III promoter, the transcription start site within a remaining loxP in an expression system using that promoter can be predicted. The downstream terminal residues of the downstream loxP site in the system can then be mutated so that the mutant loxP sequence encodes the first one or more bases of a pre-selected short RNA encoding sequence; that is, the downstream mutant loxP sequence and the upstream end of the short RNA encoding sequence overlap. In this manner the system can be adapted to produce a variety of exact short RNA transcripts that do not necessarily include wild-type loxP sequence.
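The prediction and mutation steps just described can be sketched as a small calculation. The sketch assumes that the TATA box absorbs the first three loxP bases (ATA) and that the stated distance counts the bases lying between the 3′ end of the TATA box and the (+1) site, a reading that reproduces the 5-base overlap described for the U6 promoter; the function and parameter names are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # wild-type loxP, 34 bp

def transcribed_loxp_bases(tata_to_start_gap: int, loxp_in_tata: int = 3) -> int:
    """Number of 3'-terminal loxP bases that fall at or after the (+1) site.

    tata_to_start_gap: bases between the 3' end of the TATA box and (+1)
                       (assumed meaning of the promoter-specific distance).
    loxp_in_tata:      loxP bases that form part of the TATA box (ATA).
    """
    start_index = loxp_in_tata + tata_to_start_gap  # 0-based index of (+1) in loxP
    return max(0, len(LOXP) - start_index)

def mutate_loxp(short_rna_5prime: str, tata_to_start_gap: int = 26) -> str:
    """Replace the transcribed loxP tail with the 5' bases of the short RNA."""
    n = transcribed_loxp_bases(tata_to_start_gap)
    if len(short_rna_5prime) < n:
        raise ValueError("short RNA sequence is shorter than the overlap")
    return LOXP[:len(LOXP) - n] + short_rna_5prime[:n].upper()
```

With a gap of 26, the last 5 loxP bases are transcribed, and supplying a short RNA sequence that begins GGAAA yields a mutant site ending in GGAAA in place of GTTAT.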

Methods of Using the Inducible Expression System

The inducible expression system disclosed herein can be used in conditional, loss-of-function genetic studies in animals and cells. For example, transgenic animals whose genomes incorporate the expression system described herein can be crossed with transgenic animals carrying the Cre recombinase gene under the control of a temporally or spatially regulated promoter. Temporally regulated promoters are developmentally regulated promoters that turn on gene expression at specific stages of embryonic or animal development. Spatially regulated promoters are promoters that turn on gene expression only in defined cellular or anatomical locations, e.g., tissue-specific promoters. Many strains of Cre transgenic mice have been developed that carry a Cre transgene under the control of a developmentally regulated or tissue-specific promoter. One notable source of such strains is The Jackson Laboratory, Bar Harbor, Me.

Even a single transgenic mouse line whose genome harbors the inducible expression systems of the present invention can be crossed with a variety of regulated Cre-expressing transgenic mice to create a variety of double transgenic mice, which are suitable for use in many conditional, loss-of-function studies. These double transgenic lines can be used to study the effects of knocking down expression of a target gene in individual tissues, e.g., to study the effects of knocking down expression of a target gene only in neural tissue or only in specific cell types. The effect of knocking down the expression of essential target genes in adult animals can be studied using double transgenics that contain a developmentally regulated Cre gene that is only expressed in the adult animal. Similarly, the role of a gene during different stages of development can be studied by using different double transgenic mice that carry the same short RNA expression construct, but different Cre transgenic constructs that express the Cre gene at different stages of development.

The expression systems described herein can also be used to study the effects of knocking down multiple gene products expressed by multiple genes, which share some genetic sequence identity. For example, an expression system coding for only one siRNA-like molecule can be used to down regulate expression of more than one gene product, provided those genes share an identical siRNA target sequence. Thus, a single nucleic acid expression system, or an organism or a cell carrying one such nucleic acid, of the present invention can be used to study the role of gene products from multiple gene family members, provided each member of the gene family shares some sequence identity with the other gene family members at the target site for the short RNA that is inducibly expressed by the nucleic acid. The Example provided below discloses an expression system designed to produce a single short RNA transcript that down regulates several members of the A1 group of genes in the bcl-2 family of genes.
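One way to look for a target site shared by several gene family members is to intersect the fixed-length windows of their sequences. The sketch below is illustrative only: the 21-nucleotide default length, the function name, and the requirement of exact identity across all inputs are assumptions, and real target selection would involve further criteria not covered here.

```python
def shared_target_sites(sequences, length: int = 21) -> set:
    """Return every window of the given length present in all sequences."""
    seqs = [s.upper() for s in sequences]
    if not seqs:
        return set()

    def windows(seq: str) -> set:
        # All contiguous substrings of the requested length.
        return {seq[i:i + length] for i in range(len(seq) - length + 1)}

    common = windows(seqs[0])
    for seq in seqs[1:]:
        common &= windows(seq)  # keep only windows seen in every sequence so far
    return common
```

Any window returned by the function is, by construction, identical in every input sequence, which is the condition stated above for a single short RNA to act on more than one gene product.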

The expression system of the present invention can also be used in conjunction with other methods of conditionally delivering Cre recombinase to animals or cells harboring the nucleic acids disclosed herein. For example, cells transformed or transfected with the expression system can be exposed to exogenous Cre recombinase. The Cre protein can be delivered into the cells using any reagent suitable for the delivery of protein into a cell, e.g., liposomes or electroporation. Delivery of the Cre protein into the cell can thereby induce the recombination event that allows expression of the short RNA encoding sequence.

The inducible short RNA expression system disclosed herein is a powerful tool for conducting conditional loss of gene function experiments. Animals or cells harboring the nucleic acids disclosed herein can be induced to express the short RNA coded for by the nucleic acids, and changes in these animals or cells can be monitored. The types of changes that can be monitored include, but are not limited to, physiological changes, molecular biological changes, biochemical changes, changes in genetic expression, histological changes, gross anatomical changes, behavioral changes, changes in viability, changes in morbidity, and changes in mortality. Other changes that can be monitored include changes in compound-mediated effects on a cell or on an organism, e.g., changes in drug efficacy and/or changes in any other drug-induced effect or side effect.

EXAMPLE

The example provides a DNA construct for the inducible production of shRNAs that target the A1a, A1b, and A1d genes of the bcl-2 family. The construct was inserted into the genome of mouse embryonic stem cells, shRNA transcription was induced, and the construct was shown to selectively knock down the expression of an A1-fusion reporter gene. (shRNAs are short hairpin RNAs that can be processed into siRNAs that activate RNAi.)

The Construct: U6lox-shA1

The construct used in this example included a U6 promoter sequence, a loxP-flanked STOP cassette, and an shRNA sequence. The STOP cassette included the U6 transcription termination sequence. The U6 termination sequence consisted of the wild-type run of consecutive thymidines, i.e., the T-stretch, and 190 bp of genomic DNA downstream of the T-stretch. An additional T-stretch was inserted next to the endogenous U6 T-stretch to enhance the efficiency of transcriptional termination of the STOP cassette. Insertion of the loxP-flanked STOP cassette between the U6 promoter and the shRNA gene required several adjustments in order to ensure proper shRNA transcription upon Cre-mediated deletion of the STOP sequence. Transcriptional initiation at (+1) is crucial for the precise generation of short RNAs by RNA Pol III. Deletion of the STOP cassette leaves only one loxP site at the site of its integration. If the STOP cassette were inserted after the (+1), this would result in a loxP-shRNA fusion transcript, which could interfere with proper shRNA processing and siRNA generation. To avoid transcription of the loxP site, it had to be integrated into the U6 promoter between the TATA box and (+1). Mutational analysis of the Pol III promoter suggested that this sequence could be altered without affecting the efficiency of Pol III-mediated transcription. Myslinski et al., Nucleic Acids Res. 29:2502-2509 (2001). However, since the (+1) site is located 26 bp downstream of the U6 TATA box and one loxP site comprises 34 bp, accommodation of a loxP site in the U6 promoter required the following adjustments: the first 3 bp of the loxP site (ATA) were integrated into the TATA box and the last 5 bp of the shRNA-proximal loxP site were exchanged for the first 5 bp of the shRNA coding sequence. A 5 bp mutation at the distal end of the inverted repeat was not expected to dramatically decrease recombination efficiency. FIG. 1 shows a schematic view of the inducible construct.

The entire construct is referred to as the U6lox-shA1 cassette or U6lox-shA1 construct. The shRNA sequence is referred to as shA1, since it is directed against the bcl-2 family members A1a, A1b and A1d. Upon expression of shA1, the RNAi-processed RNA transcript produced is referred to herein as siA1.

The U6lox-shA1 cassette was cloned in three steps. First, the modified U6 promoter was PCR amplified from the U6 promoter containing plasmid pU6 (Sui, et al., Proc. Nat'l Acad. Sci. USA 99:5515-5520 (2002)) using primers XbaI-U6 and U6loxT-RI (see Table 1). The 5′ primer introduced an XbaI site 5′ of the U6 promoter; the 3′ primer replaced the sequence 3′ of the TATA box with a loxP site, two T-stretches, and an EcoRI site. Second, the Pol III termination sequence was PCR amplified from C57BL/6 genomic DNA using primers U6termRI and U6term1-B (see Table 1), which introduced a 5′ EcoRI site and a 3′ BamHI site. Third, a fragment consisting of a mutant loxP site fused to shA1 was generated by oligonucleotide synthesis of two complementary oligomers, lox-shA1-s and lox-shA1-as (see FIG. 1 for sequence information). The annealed oligomer contained a 5′ BamHI site and a 3′ HindIII site. The three subfragments from the steps listed above were cloned into a modified pBS-polylinker, resulting in an AscI-flanked U6lox-shA1 construct. The sequence of the U6lox-shA1 construct is shown in Table 2.

FIG. 1 (a) shows the U6lox-shA1 construct before a Cre-mediated excision of the termination sequence, and FIG. 1 (b) shows the U6lox-shA1 construct after a Cre-mediated deletion of the termination sequence. Triangles are loxP sites; the STOP rectangle is the U6 termination sequence comprising two T-stretches and 190 bp of wild-type genomic sequence immediately downstream of the genomic U6 T-stretch; U6lox is a modified U6 promoter sequence containing a loxP site. The sequence from the TATA box to the T-stretch following the shA1 sequence is shown below each construct (the omitted wild-type U6 termination sequence is marked with “STOP”). The distance from the 3′ end of the TATA box to the shRNA transcription initiation site (+1) is 26 bp. FIG. 1 shows the overlap between TATA and the 5′ end of the loxP sequence, and it also shows the overlap between the upstream 5 bp of the shA1 encoding sequence and the shA1-proximal terminus of the mutant loxP.

Insertion of the Construct into HPRT Deficient HM1 Embryonic Stem Cells

As a first step in generating a transgenic mouse strain that allows ubiquitous induction of shA1-mediated RNAi upon Cre-mediated recombination in a defined genetic locus, the U6lox-shA1 construct was targeted into the X-linked hypoxanthine phosphoribosyltransferase (HPRT) locus by homologous recombination in ES cells. This approach takes advantage of the fact that HPRT-deficient HM-1 ES cells permit extremely efficient selection of transgenes inserted into the HPRT locus. Thompson, et al., Cell 56:313-321 (1989). HM-1 ES cells lack the HPRT promoter and exons 1 and 2. Only reconstitution of the disrupted HPRT locus by gene targeting confers resistance to HAT selection. Hence, virtually every HAT-resistant ES cell colony carries the targeted HPRT allele. A targeting vector that allows the insertion of transgenes into HM-1 ES cells has been described previously (pMP-8SKB; Bronson et al., 1996). A modified version of this vector, referred to as pMP-10, has been developed, which can be linearized with SwaI, SbfI or SgfI and harbors two additional unique restriction sites (AscI and PmeI) to insert the transgene of choice. The U6lox-shA1 cassette was inserted into the AscI restriction site of pMP-10 in the same transcriptional orientation as the HPRT gene. The targeting vector was linearized with SwaI and transfected into HM-1 ES cells.

The targeting strategy is shown in FIG. 2 a. FIG. 2 a shows a partial restriction map of the HPRT wild-type genomic locus (HPRT WT), below which is a partial restriction map of the HM1 mutant HPRT genomic locus (HM1), and below both of which is a partial restriction map showing the insertion, i.e., Knock-In, of the U6lox-shA1 construct into the HM1 mutant HPRT locus (U6lox-shA1 KI). HPRT exons are shown as boxes with roman numerals above them. StuI restriction sites are marked by a capital S.

FIG. 2 b is a Southern blot confirming insertion of the U6lox-shA1 construct. The integrity of HAT-resistant colonies was confirmed by Southern blotting using a StuI digest and probe RSA. RSA is shown in FIG. 2 a above the general location of its binding site near HPRT exon III. Two independent ES cell clones were injected into C57BL/6 blastocysts.

Testing the Construct

The ability of the construct to effect Cre-mediated induction of shRNA expression and subsequently knock down A1 expression was tested in transgenic ES cells. Endogenous A1 expression is barely detectable in ES cells. Therefore, to increase measurable A1 signal, a transgene encoding an A1-IRES-EGFP reporter was introduced into targeted ES cells. A1 cDNA was fused to DNA containing an internal ribosomal entry site (IRES) followed by EGFP cDNA. Expression of this fusion construct results in a bicistronic mRNA encoding A1 and EGFP. siA1-mediated degradation of this mRNA was predicted to result in loss of both A1 and EGFP expression.

The coding sequence of the mouse A1d gene was PCR-amplified from splenic cDNA using primers A1d-X and A1d-B (see Table 1), which introduced a 5′ XhoI site and a 3′ BamHI site. The PCR fragment was then subcloned into BamHI/XhoI-digested pIRES2-EGFP (Clontech, Palo Alto, Calif.) to generate the A1-IRES-EGFP fusion construct.

To test the sequence specificity of siA1, a second, mutated A1 expression construct (mutA1-IRES-EGFP) was cloned into pIRES2-EGFP. The mutA1 cDNA contains 6 conservative mutations at the siA1 target site (see FIG. 3) and was generated by PCR amplification using primers A1d-X and mutA1-B (see Table 1). A1-IRES-EGFP constructs were subcloned into the neoR selectable marker containing expression vector pCXN2. Niwa, et al., Gene 108:193-199 (1991).

A1-IRES-EGFP, mutA1-IRES-EGFP and IRES-EGFP fragments were excised from the respective pIRES2-EGFP vectors using XhoI and NotI and inserted into an XhoI site 3′ of the chicken β-actin promoter of pCXN2. Expression vectors were SalI-linearized and transfected into U6lox-shA1 ES cells. Stable integrants were selected with G418 starting 2 days after transfection. Single G418-resistant ES cell colonies were analyzed for EGFP expression in order to confirm expression of the reporter transgene.

FIG. 3 depicts the three fusion constructs used to verify specific RNAi-mediated gene silencing by shA1. The A1 box represents the A1 cDNA sequence, the IRES box represents the internal ribosome entry site sequence, the EGFP box represents the EGFP gene, and the pA box represents the polyadenylation (poly A) site from the pCXN2 expression vector. The mutA1 box represents the mutated A1 cDNA; gray letters in the sequence below the mutA1 box indicate mutated bases. The siA1 box represents the predicted product of the RNAi-processed shA1 transcript; the siA1 box is depicted above the siA1 target site.

Cre-Mediated Induction of RNAi in ES Cells

EGFP+ clones of each transgenic ES cell line were transduced with a Cre-expressing adenovirus in order to delete the loxP-flanked STOP cassette and induce shRNA expression. See, e.g., Bassing, et al., Cell 109 Suppl:S45-55 (2002). Untransduced cells served as a negative control. Seven days after transduction, ES cells were analyzed for EGFP expression by FACS analysis.

FIG. 4A shows that only ES cell clones that were exposed to Cre and carried the perfectly complementary A1-IRES-EGFP transgene showed downregulation of EGFP expression, demonstrating sequence-specific and inducible RNAi in U6lox-shA1 ES cells. FIG. 4A depicts the results of FACS analysis of EGFP expression in transduced (open histograms) or untransduced ES cells (shaded histograms). The respective EGFP transgene is indicated, and AV-Cre stands for Cre-expressing adenovirus.

The fact that EGFP downregulation occurred only in ~60% of cells likely reflects incomplete deletion of the STOP cassette. This was confirmed by PCR analysis of genomic DNA isolated from total cell lysate or subpopulations that were sorted according to EGFP expression levels. Deletion of the STOP cassette occurred exclusively in cells showing EGFP downregulation, i.e., the EGFP-low cells, as shown in FIG. 4B.

FIG. 4B shows a schematic of the targeted HPRT locus. Half-arrows depict primers hHPRT-pro and HPRT-SAH (see Table 1) flanking the inserted U6lox-shA1 cassette. The arrow represents the human HPRT promoter, the gray box depicts human exon 1, and the white box depicts mouse exon 2; the map is not drawn to scale. PCR results are shown for transduced and untransduced ES cells transgenic for IRES-EGFP (IRES), A1-IRES-EGFP (A1) or mutA1-IRES-EGFP (mutA1). A1-IRES-EGFP transgenic ES cells were sorted according to EGFP expression levels. DNA from EGFP-high cells and EGFP-low cells was subjected to PCR. The expected sizes for PCR fragments before (U6-STOP-A1) and after deletion of the Pol III STOP cassette (U6-A1) are indicated. The asterisk indicates a fragment resulting from a DNA hybrid of one U6-STOP-A1 strand and one U6-A1 strand.

Importantly, FIGS. 4C and 4D show that similar levels of Cre-mediated deletion and concomitant siRNA generation were detected in all Cre-treated ES cell lines, emphasizing the specificity of siA1 for A1-IRES-EGFP mRNA. To determine the extent of mRNA degradation, EGFP-containing mRNA levels were analyzed by Northern blotting using a probe specific for EGFP. The results of Northern blot analysis of transduced and untransduced ES cells carrying the indicated transgene are shown in FIGS. 4C and 4D. 20 μg of total RNA were loaded per lane. Synthetic double-stranded siRNAs of identical sequence were loaded in the amounts indicated above the siA1 lanes of FIG. 4C to estimate siRNA expression levels. The size of the detected mRNA differed depending on the presence or absence of the A1 cDNA. Detection of GAPDH mRNA served to normalize for loading differences. EGFP mRNA levels were strongly reduced in total cell lysate, and the remaining mRNA is likely to originate from cells that have not undergone deletion of the STOP cassette. Indeed, when cells were sorted according to EGFP expression, A1-IRES-EGFP mRNA was barely detectable in EGFP-low cells, and image quantification showed a >10 fold reduction of mRNA when compared to EGFP-high cells. No mRNA reduction could be observed in untransduced A1-IRES-EGFP transgenic ES cells or in IRES-EGFP control samples. These data demonstrate that a single copy of the U6lox-shA1 cassette mediates efficient, sequence-specific and tightly regulated suppression of A1 in vitro.

TABLE 1 Primers for polymerase chain reaction (PCR)

Name | Sequence (5′-3′) | Location | Orientation
LAH53 | GGACCTCCATCTGCTCTTATTT | 5′ of DQ52 | s*
CDR3-PE | GGTCTATTACTGTGCAAGTTGG | CDR3 of VPE | as
U6termRI | TGTGAATTCGTTCCTCAGAGGAACTGA | 3′ of U6 gene | s
U6term1-B | TGTGGATCCCCCGGGCGTGGCTTGGTGGTACACCTC | 3′ of U6 gene | as
XbaI-U6 | GACTCTAGATCCGACGCCGCCATCTCTAG | U6 promoter | s
U6loxT-RI | TGCGAATTCAAAAATCGCAAAAACGTAATAACTTCGTATAAGTATGCTATACGAAGTTATAGTCTCAAAACACACAATTACTTAC | U6 promoter | as
A1d-X | TGCTCGAGATGTCTGAGTACGAGTTCATGCATATC | A1d cDNA | s
mutA1-B | CTGGATCCTTATTTCAGCAGGAACAGCATCTCCCATATCTG | A1d cDNA | as
A1d-B | CTGGATCCTTACTTGAGGAGAAAGAGCATTTC | A1d cDNA | as
HPRT-SAH | TTCCTAATAACCCAGCCTTTG | pMP-10 SAH | s
hHPRT-pro | GTGATGGCAGGAGATTTGTAA | hHPRT promoter | as

Abbreviations in Table 1: s, sense strand; as, antisense strand

TABLE 2 Sequence of the U6lox-shA1 construct

5′-tccgacgccgccatctctaggcccgcgccggccccctcgcacagacttgtgggagaagctcggctactcccctgccccggttaatttgcatataatatttcctagtaactatagaggcttaatgtgcgataaaagacagataatctgttctttttaatactagctacattttacatgataggcttggatttctataagagatacaaatactaaattattattttaaaaaacagcacaaaaggaaactcaccctaactgtaaagtaattgtgtgttttgagactataacttcgtatagcatacattatacgaagttattacgtttttgcgatttttgaattcgttcctcagaggaactgacaagcaccctaacatcctattggaggctcactcacgttttttctattttgtttcttgacagcagagctcgttgctcactgtatagctcaggttggcctgacactgatgaggttctccagtgactgcctctacctacctactgggatgacagaggtgtaccaccaagccacgcccgggggatccataacttcgtatagcatacattatacgaaggaaatgctctttctcctcaaagctttgaggagaaagagcatttcccttttt-3′

The nucleotide sequence in Table 2 encodes the following functional units (numbering begins at the 5′ end):

1-282: U6 promoter upstream of TATA box

283-287: TATA box

284-317: loxP

318-530: termination sequence starting with additional TTTT

543-577: mutant loxP

572-end: shA1 hairpin plus T-stretch

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

1. A nucleic acid molecule comprising: an RNA polymerase III promoter sequence; a short RNA encoding sequence comprising a transcription initiation site; a loxP-flanked STOP cassette comprising an RNA polymerase III-specific termination sequence, a first loxP sequence, and a second loxP sequence, wherein (i) each of the two loxP sequences comprises a spacer region, (ii) the termination sequence is disposed between the first and second loxP sequences, and (iii) the termination sequence is disposed between the promoter sequence and the transcription initiation site of the short RNA encoding sequence in the nucleic acid molecule.
 2. The molecule of claim 1, wherein each of the loxP sequencescomprises one or more mutations in its spacer region.
 3. The molecule ofclaim 1, wherein the first loxP sequence is a wild-type loxP sequence.4. The molecule of claim 1, wherein the second loxP sequence is a mutantloxP sequence.
5. The molecule of claim 1, wherein the second loxP sequence is closer to the short RNA encoding sequence than the first loxP sequence; the second loxP sequence comprises a distal terminal sequence and a proximal terminal sequence, wherein the spacer region is disposed between the distal and the proximal terminal sequences, the distal terminal sequence is closer to the termination sequence than the spacer region, and the proximal terminal sequence is closer to the shRNA encoding sequence than the spacer region; the second loxP proximal terminal sequence overlaps with 1 to 10 nucleotides of the 5′ end of the short RNA encoding sequence; and the 1 to 10 nucleotides of the 3′ end of the second loxP proximal terminal sequence consist of the 5′ end of the short RNA encoding sequence.
6. The molecule of claim 1, further comprising a thymidine nucleotide immediately preceding the upstream terminal sequence of the first loxP, wherein the first loxP is upstream of the termination sequence.
7. The molecule of claim 1, wherein the RNA polymerase III promoter sequence comprises genomic sequence of the small nuclear RNA U6 promoter or a functional equivalent thereof.
8. The molecule of claim 7, wherein the termination sequence comprises genomic sequence downstream of the small nuclear RNA U6 transcription termination signal.
9. The molecule of claim 8, wherein the termination sequence is a modified U6 transcription termination sequence comprising: between 1 and 20, inclusive, additional thymidine nucleotides disposed immediately adjacent to the wild-type U6 thymidine termination signal; and between 1 and 190, inclusive, additional nucleotides of animal genomic sequence that is immediately downstream of the thymidine termination sequence of the wild-type small nuclear RNA U6 gene.
10. The molecule of claim 8, wherein the termination sequence further comprises one or more additional RNA polymerase III termination signals.
11. The molecule of claim 1, wherein the short RNA encoding sequence encodes a transcript with fewer than 30 nucleotides.
12. The molecule of claim 1, wherein the molecule comprises a sequence selected from the group consisting of: SEQ ID NOs: 1 to 7.
13. A transgenic animal whose genome comprises the nucleic acid molecule of claim 1.
14. The transgenic animal of claim 13, further comprising a nucleic acid molecule encoding a Cre recombinase.
15. The transgenic animal of claim 14, wherein expression of the Cre recombinase is developmentally regulated.
16. The transgenic animal of claim 13, wherein expression of the Cre recombinase is tissue-specific.
17. The animal of claim 13, wherein the animal is selected from the group consisting of a mouse, a rat, a guinea pig, a goat, a pig, a monkey, a baboon, a chimpanzee, a cow, a rabbit, a sheep, a dog, a cat, a hamster, a chicken, and a frog.
18. A eukaryotic cell comprising the nucleic acid molecule of claim 1.
19. The cell of claim 18, wherein the cell is an animal cell.
20. The cell of claim 18, wherein the cell is a mammalian cell.
21. The cell of claim 19, wherein the cell is an embryonic stem cell.
22. The cell of claim 18, further comprising a nucleic acid molecule encoding a Cre recombinase gene.
23. The cell of claim 18, further comprising a Cre recombinase protein.
24. A method of making an inducible short RNA expression system, the method comprising linking two or more nucleic acids to produce the nucleic acid of claim 1.
25. A method of making a transgenic animal comprising: introducing the molecule of claim 1 into the genome of an embryonic stem cell; introducing the embryonic stem cell into an embryo; implanting the embryo in an animal capable of carrying the embryo to term; and allowing the embryo to come to term, thereby generating a transgenic animal.
26. The method of claim 25, wherein: the molecule of claim 1 is introduced into the genome of an oocyte; the oocyte is fertilized to produce an embryo; the embryo is implanted in an animal capable of carrying the embryo to viability; and the embryo is allowed to become a viable animal, thereby generating a founder transgenic animal.
27. The method of claim 25, wherein the method generates a chimeric transgenic animal, and further comprising: crossing the chimeric transgenic animal to another animal of the same species to generate a founder transgenic animal whose genome includes the molecule of claim 1.
28. A method of making an animal cell containing an inducible short RNA expression system, the method comprising: transfecting a cell with the molecule of claim 1.
29. The method of claim 28, wherein the cell is a cell from any one of the following animals: a human, a mouse, a rat, a guinea pig, a goat, a pig, a monkey, a baboon, a chimpanzee, a cow, a horse, a rabbit, a sheep, a chicken, a dog, a cat, a frog, or a fish.
30. A method of evaluating gene function in a cell, the method comprising: providing the cell of claim 18; inducing transcription of the short RNA encoding sequence; and monitoring changes in the cell.
31. A method of evaluating gene function in an organism, the method comprising: providing the transgenic animal of claim 13; inducing transcription of the short RNA encoding sequence; and monitoring changes in the organism.
32. A method of treating a patient, the method comprising: administering the molecule of claim 1 to a patient in need of having expression of one or more genes reduced, wherein the short RNA encoding sequence encodes a transcript designed to reduce expression of the one or more genes the patient is in need of reducing.
33. The method of claim 32, wherein the method comprises administering the molecule in the cell of claim 18.
34. A method of identifying a candidate RNAi effector with reduced activity in T-cells, the method comprising: administering or inducing expression of siRNA in a T-cell and a control cell; evaluating expression of an mRNA or protein in the T-cell and the control cell; and identifying an mRNA or protein (a) with a reduced expression level or (b) that is differently modified in the T-cell relative to control, wherein the control cell is not a mature lymphocyte and an mRNA or protein with reduced levels or that is differently modified in the T-cell relative to control is a candidate RNAi effector with reduced activity in T-cells.
35. A method of identifying a candidate inhibitor of RNAi in T-cells, the method comprising: administering or inducing expression of siRNA in a T-cell and a control cell; evaluating expression of an mRNA or protein in the T-cell and the control cell; and identifying an mRNA or protein (a) with an increased expression level or (b) that is differently modified in the T-cell relative to control, wherein the control cell is not a mature lymphocyte and an mRNA or protein with increased levels or that is differently modified in the T-cell relative to control is a candidate inhibitor of RNAi in T-cells.
36. A method of identifying a missing RNAi effector or inhibitor of RNAi in T-cells, the method comprising: identifying a candidate missing RNAi effector or candidate inhibitor of RNAi by performing the method of claim 34; and (i) in one or more T-cells, (a) introducing the identified candidate RNAi effector or (b) modifying the identified candidate RNAi effector, and subsequently determining if (a) or (b) increases RNAi efficiency in the one or more T-cells, wherein a candidate that increases RNAi efficiency is an RNAi effector with reduced activity in T-cells; (ii) introducing or modifying the identified candidate inhibitor of RNAi in a cell, and subsequently determining if it reduces RNAi efficiency in the cell, wherein a candidate that reduces RNAi efficiency in the cell is an inhibitor of RNAi in T-cells; or (iii) inactivating the identified candidate inhibitor in a T-cell, and subsequently determining if inactivation increases RNAi efficiency in the T-cell, wherein an inactivated candidate inhibitor that increases RNAi efficiency in the T-cell is an inhibitor of RNAi in T-cells.